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METHOD AND APPARATUS FOR COPYING In one embodiment, a second process is used to store the 

DATA THAT RESIDES IN A DATABASE particular set of data that was extracted into a second 

_ database. 
HELD OF THE INVENTION According to another aspect of the invention, a method 

The present invention relates to the copying of data and 5 and apparatus for producing a copy of data from a first 
more specifically to producing a copy of data that resides in database is provided. 

a database. According to the method, a first set of data in the first 

BACKGROUND OF THE INVENTION d , atab , a f » locked Mter lock f « set of data, a 

plurality of processes are requested to obtain snapshot times 

Planning software is used by manufacturers to aid in the 10 from a database server associated with said first database. 

manufacturing process. Based upon the desired product The snapshot times cause all subsequent reads of the first 

output and the components needed for each product, the database by the plurality of processes to return data from the 

planning software generates a schedule of what components first database as of the snapshot times. After waiting a 

need to be manufactured and by when, and what materials particular period of time for the plurality of processes to be 

need to be procured and by when. This schedule is generated 15 assigned snapshot times, the locks on the first set of data in 

based upon data stored in a database. The planning process the first database are released. 

is often complicated and, for complex products, can take ^ pi ura lity of processes that were successful in obtain- 
many hours to complete. j n g a snapshot time within the particular period of time are 

A typical requirement imposed by the planning software ^ used to extract a copy of the first set of data from the first 
is that it needs to do its processing based on a single database. The copy of the first set of data is then separately 
consistent version of the database. If one process of the stored from the first of data, 
planning software is reading one version of the database 

while another process is reading an updated version of the BRIEF DESCRIPTION OF THE DRAWINGS 

database serious errors such as double counting can occur ^ t fe mustrated b of k 

As a result, planning software requires tot data be provided and J by rf in fc fi of ±c accom . 

from a single frozen version of the database m order to panvingdrawingsandinwhichlike refer ence numerals refer 
operate properly. to similar elements and ^ which: 

As is well known, a database or a selected subset thereof 1 . M , - ewt „ . « 

' . . , • i i 11 c *x. FIG. 1 is a block diagram oi a system mat may be 

can be frozen by obtaining exclusive locks on all or the ^ n . . . t f . i 

tables in the database or the selective subset. Once locked, *° Programmed to implement the present invention; 
the tables can be processed by the planning software to carry FIG. 2 is a block diagram of a system that is used for 
out the planning process. However, as noted above, the producing a copy of a database in accordance with an 
planning process can take many hours to complete. Many embodiment of the present invention; 

companies, especially those having offices around the world, 35 FIG. 2A illustrates the use of a snapshot time in accor- 

cannot afford to lock their tables for extended periods of ° dance with an embodiment of the invention; 
time. Hence, locking tables in this manner is often not a fig. 2B illustrates the locking a database to obtain 
viable solution. snapshot times that correspond to a single database state; 

Another possible solution is to simply make a copy of the piQ 3 A is a portion of a flow diagram illustrating a 

database prior to running the planning software. The prob- 40 method for producing a copy of a database without retaining 
lem with this solution is that for large databases, the copying a i oc fc on the database tables according to an embodiment of 
process itself can take several hours. During this time, the the invention; 

tables need to be locked to ensure a frozen state. As long as FIG 3B fe another portion of a fiow diagram illustrating 

the tables are locked, no updates can be made. Hence, this a method for producing a copy of a database without 

solution suffers from the same shortcomings, albeit to a A5 retam ; ng a i oc k on tne database tables according to an 

lesser degree, as the locking solution. embodiment of the invention; 

Based on the foregoing, it is clearly desirable to provide mQ 4 fa a bbck of a system &at fc used for 

a mechanism for obtaining a single frozen version of the producing a ^pv of a database in accordance with an 

database, or a subset thereof, without locking tables m the embodiment of ^ present invention; 

database for an extended period of time. 50 . . , a .„ . + . 

r FIG. 5A is a portion of a flow diagram illustrating a 

SUMMARY OF THE INVENTION method for producing a copy of a database without locking 

According to one aspect of the invention, a method and tbe database tables » 
apparatus for supplying a consistent set of data to a software FIG. 5B is another portion of a flow diagram illustrating 

application is provided. 55 a method for producing a copy of a database without locking 

According to the method, a software application is the database tables; and 
launched that requires a particular set of data contained in a FIG. 6 is a block diagram of alternate system for produc- 
first database. Once the particular set of data is identified, a ing a copy of a database in accordance with an embodiment 
first process is requested to obtain a snapshot time from a of the present invention, 

database server associated with the first database. The snap- 60 DETAILED DESCRIPTION OF THE 

shot time causes all subsequent reads of the first database by PREFERRED EMBODIMENT 

the first process to return data that reflects a database state ' ^ Ji L ^ A ~ ^ 

associated with the snapshot time. After the first process A method and apparatus for producing a consistent copy 
obtains the snapshot time, the first process extracts the of a database, or portion thereof, is described. In the fol- 

particular set of data from the first database. The software 65 lowing description, for the purposes of explanation, numer- 
application is then supplied with the particular set of data ous specific details are set forth in order to provide a 
that was extracted from the first database. thorough understanding of the present invention. It will be 
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apparent, however, to one skilled in the art that the present 
invention may be practiced without these specific details. In 
other instances, well-known structures and devices are 
shown in block diagram form in order to avoid unnecessarily 
obscuring the present invention. 5 

Hardware Overview 

FIG. 1 is a block diagram that illustrates a computer 
system 100 upon which an embodiment of the invention 
may be implemented. Computer system 100 includes a bus 
102 or other communication mechanism for communicating 
information, and a processor 104 coupled with bus 102 for 
processing information. Computer system 100 also includes 
a main memory 106, such as a random access memory ^ 
(RAM) or other dynamic storage device, coupled to bus 102 
for storing information and instructions to be executed by 
processor 104. Main memory 106 also may be used for 
storing temporary variables or other intermediate informa- 
tion during execution of instructions to be executed by ^ 
processor 104. Computer system 100 further includes a read 
only memory (ROM) 108 or other static storage device 
coupled to bus 102 for storing static information and instruc- 
tions for processor 104. A storage device 110, such as a 
magnetic disk or optical disk, is provided and coupled to bus ^ 
102 for storing information and instructions. 

Computer system 100 may be coupled via bus 102 to a 
display 112, such as a cathode ray tube (CRT), for displaying 
information to a computer user. An input device 114, includ- 
ing alphanumeric and other keys, is coupled to bus 102 for 3Q 
communicating information and command selections to 
processor 104. Another type of user input device is cursor 
control 116, such as a mouse, a trackball, or cursor direction 
keys for communicating direction information and com- 
mand selections to processor 104 and for controlling cursor 35 
movement on display 112. This input device typically has 
two degrees of freedom in two axes, a first axis (e.g., x) and 
a second axis (e.g., y), that allows the device to specify 
positions in a plane. 

The invention is related to the use of computer system 100 40 
for producing a copy of a database. According to one 
embodiment of the invention, a copy of a database is 
produced by computer system 100 in response to processor 
104 executing one or more sequences of one or more 
instructions contained in main memory 106. Such instruc- 45 
tions may be read into main memory 106 from another 
computer-readable medium, such as storage device 110. 
Execution of the sequences of instructions contained in main 
memory 106 causes processor 104 to perform the process 
steps described herein. In alternative embodiments, hard- 50 
wired circuitry may be used in place of or in combination 
with software instructions to implement the invention. Thus, 
embodiments of the invention are not limited to any specific 
combination of hardware circuitry and software. 

The term "computer-readable medium" as used herein 55 
refers to any medium that participates in providing instruc- 
tions to processor 104 for execution. Such a medium may 
take many forms, including but not limited to, non-volatile 
media, volatile media, and transmission media. Non-volatile 
media includes, for example, optical or magnetic disks, such 60 
as storage device 110. Volatile media includes dynamic 
memory, such as main memory 106. Transmission media 
includes coaxial cables, copper wire and fiber optics, includ- 
ing the wires that comprise bus 102. Transmission media can 
also take the form of acoustic or light waves, such as those 65 
generated during radio-wave and infra-red data communi- 
cations. 
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Common forms of computer- readable media include, for 
example, a floppy disk, a flexible disk, hard disk, magnetic 
tape, or any other magnetic medium, a CDROM, any other 
optical medium, punchcards, papertape, any other physical 
medium with patterns of holes, a RAM, a PROM, and 
EPROM, a FLASH-EPROM, any other memory chip or 
cartridge, a carrier wave as described hereinafter, or any 
other medium from which a computer can read. 

Various forms of computer readable media may be 
involved in carrying one or more sequences of one or more 
instructions to processor 104 for execution. For example, the 
instructions may initially be carried on a magnetic disk of a 
remote computer. The remote computer can load the instruc- 
tions into its dynamic memory and send the instructions over 
a telephone line using a modem. A modem local to computer 
system 100 can receive the data on the telephone line and 
use an infra-red transmitter to convert the data to an infra-red 
signal. An infra-red detector can receive the data carried in 
the infra-red signal and appropriate circuitry can place the 
data on bus 102. Bus 102 carries the data to main memory 
106, from which processor 104 retrieves and executes the 
instructions. The instructions received by main memory 106 
may optionally be stored on storage device 110 either before 
or after execution by processor 104. 

Computer system 100 also includes a communication 
interface 118 coupled to bus 102. Communication interface 
118 provides a two-way data communication coupling to a 
network link 120 that is connected to a local network 122. 
For example, communication interface 118 may be an inte- 
grated services digital network (ISDN) card or a modem to 
provide a data communication connection to a correspond- 
ing type of telephone line. As another example, communi- 
cation interface 118 may be a local area network (LAN) card 
to provide a data communication connection to a compatible 
LAN. Wireless links may also be implemented. In any such 
implementation, communication interface 118 sends and 
receives electrical, electromagnetic or optical signals that 
carry digital data streams representing various types of 
information. 

Network link 120 typically provides data communication 
through one or more networks to other data devices. For 
example, network link 120 may provide a connection 
through local network 122 to a host computer 124 or to data 
equipment operated by an Internet Service Provider (ISP) 
126. ISP 126 in turn provides data communication services 
through the world wide packet data communication network 
now commonly referred to as the "Internet" 128. Local 
network 122 and Internet 128 both use electrical, electro- 
magnetic or optical signals that carry digital data streams. 
The signals through the various networks and the signals on 
network link 120 and through communication interface 118, 
which carry the digital data to and from computer system 
100, are exemplary forms of carrier waves transporting the 
information. 

Computer system 100 can send messages and receive 
data, including program code, through the network(s), net- 
work link 120 and communication interface 118. In the 
Internet example, a server 130 might transmit a requested 
code for an application program through Internet 128, ISP 
126, local network 122 and communication interface 118. In 
accordance with the invention, one such downloaded appli- 
cation provides for producing a copy of a database as 
described herein. 

The received code may be executed by processor 104 as 
it is received, and/or stored in storage device 110, or other 
non- volatile storage for later execution. In this manner, 
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computer system 100 may obtain application code in the 
form of a carrier wave. 

Functional Overview 

The present invention provides a mechanism for produc- 
ing a consistent copy of a database, or portion thereof, 
without locking the database or desired portion for an 
extended period of time. For the purpose of explanation, 
embodiments of the invention shall be described with ref- 
erence to a relational database that stores data in tables. The 
database portion of which a consistent copy is required is 
therefore referred to herein as the "desired tables". However, 
the techniques described herein are not limited to tables or 
relational databases. 

In one embodiment, the following steps are performed to 
obtain a snapshot (consistent copy) of a database: 

(1) the desired tables (e.g. the tables that contain infor- 
mation required by a memory-based planner) are identified; 

(2) the desired tables are locked to prevent them from 
being updated; 

(3) a coordinator requests a plurality of snapshot worker 
processes to obtain snapshot times, where the snapshot times 
allow the snapshot workers to obtain data from the database 
as of a particular state of the database; 

(4) the coordinator then waits a certain period of time for 
the requested plurality of snapshot workers to obtain snap- 
shot times; 

(5) the coordinator then releases the locks on the desired 
tables; 

(6) using the snapshot workers that obtained a snapshot 
time before the locks were released (the "successful snap- 
shot workers"), a copy of the desired tables is produced 
which represents the state of the database at the time desired 
tables were locked. 

System Overview 

FIG. 2 depicts a block diagram of a system 200 that is 
used for producing a copy of desired tables in accordance 
with an embodiment of the present invention. 

As depicted in FIG. 2, a memory based planner 202 has 
been launched by a user to generate a planning schedule that 
is based upon a snapshot of database 204. Typically, gen- 
eration of a planning schedule will only require data from a 
subset of the tables contained in database 204. Therefore, a 
copy table list 218 is maintained which lists the database 
tables (1..N) in database 204 that are required in the gen- 
eration of the planning schedule (the "desired tables"). 

A coordinator 216 interfaces with individual worker 
processes, snapshot workers (206-214), loader workers 
(230-238) and delete workers(242-248), and is responsible 
for the overall coordination of producing a copy of database 
204 for use in generation of the planning schedule. 

Operation of System 200 

To produce a snapshot of database 204, data copy table 
218 is generated which identifies the desired set of data 
tables (1..N) in database 204. The desired set of tables listed 
in data copy table 218 are the tables that contain information 
that is needed by the memory based planner 202 in order to 
generate the planning schedule. Using the information con- 
tained in the data copy table 218, the coordinator 216 locks 
each desired table (1..N). The locks thus obtained prevent 
other processes from modifying the desired tables (1..N). 

Once the locks are obtained, the coordinator 216 requests 
a plurality of snapshot workers (206, 208, 210, 212, 214 and 
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215) to obtain a snapshot time from the database manage- 
ment system (DBMS). Snapshot times are assigned by the 
database management system, and are used by the database 
management system to determine what version of data to 

5 supply to processes. For example, the version of data items 
that are supplied to a process assigned a snapshot time of Ti 
will reflect the state of those data items as of time Tl , even 
if the data items have been subsequently modified. 
Snapshot times are used to guarantee that a particular 

10 version of data is given to a process (i.e. snapshot worker). 
Thus, if a process requests data that has been modified since 
obtaining its snapshot time, the DBMS reconstructs the state 
of data as of the snapshot time and supplies the reconstructed 
data to the process. 

15 For example, FIG. 2A illustrates the use of a snapshot 
time in accordance with an embodiment of the invention. In 
FIG. 2A, DB_A, DB_JB, DB_C, DB_D, DB_E and 
DB_F respectively represent different states of database 204 
at time TO, Tl, T2, 13, T4 and T5. If a snapshot time is 

20 issued at time TO and a read is subsequently performed at 
time T3, the DBMS will respond to the read by returning 
data as it existed in the database as of time TO (i.e. DB__A). 
The DBMS may achieve this using a variety of mechanisms. 
According to one embodiment, the DBMS uses log entries 

25 to reconstruct the state of the requested data items to reflect 
their state at time TO. In effect, the snapshot time enables a 
process to see a snapshot of a database as of a particular 
time. 

3Q One mechanism by which a DBMS can reconstruct a 
particular state of the database is described in detail in U.S 
Patent Applications "Method and Apparatus for Providing 
Isolation Levels in a Database", having Ser. No. 08/613,026, 
filed Mar. 11, 1996; "Techniques for Providing the Number 
of Snapshots of a Database", having Ser. No. 08/838,967, 
filed Apr. 23, 1997, and "Dynamic Snapshot Set 
Adjustment", having Ser. No. 08/841,541, filed Apr. 23, 
1997, which are hereby incorporated by reference. 
After the coordinator 216 requests the plurality of snap- 

40 shot workers (206, 208, 210, 212, 214 and 215) to obtain 
snapshot times, it then waits for a certain period of time 
(T_WAIT) for the snapshot times to be assigned to the 
snapshot workers by the DBMS. The snapshot times that the 
DBMS assigns to the snapshot workers approximately 

45 reflect the times at which the snapshot time assignments are 
made. Since the snapshot times are being assigned after the 
desired tables are locked, and before the locks on the desired 
tables are released, the snapshot times assigned to the 
snapshot workers will correspond to a time at which the 

5Q desired tables were locked. 

In certain embodiments, the snapshot workers obtain 
snapshot times by executing a "set transaction read only" 
command. Once a process issues a "set transaction read 
only" command to the DBMS, that process is ensured by the 

55 DBMS that all subsequent reads by that process for that 
transaction will return data that was in the database as of the 
time the "set transaction read only" command was issued. 

In certain database systems, using the "set transaction 
read only" command has certain limitations. Specifically, the 

60 process that issued the command cannot perform any data- 
base modifications (defined as any inserts, updates, or 
deletes). If it does perform database modifications, it loses 
the benefits of the command (i.e. loses its snapshot time). 
Because of this, it is not possible to use the "set transaction 

65 read only" command with a transaction that makes copies 
within a database of the relevant tables (since copying would 
involve writing into a database). 
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To circumvent this problem, the successful snapshot 240, any data that was previously stored in the particular 

workers read the data from the database, and then write the snapshot table (i.e. "stale data" from a prior execution of the 

data to one or more flat files outside of the database. Because planning software process 202) must first be removed. To 

writing to a fiat file is an operating system command, and not perform this task, a plurality of delete workers (242, 244, 

a DBMS command, it does not constitute a "write" operation 5 246 and 248) are used to delete previously stored snapshot 

as far as the DBMS is concerned. Hence, the snapshot times tables from sna p shot table database 240 

obtained through the "set transaction read only" command n *i_ • * l „ , 

remains in effect for each snapshot worker. u °* ce previously stored snapshot tables are deleted and 

When the (T_WAIT) time period expires, the coordinator the flat ^ *f ded * t0 ?"? 1D %Z 

216 releases its locks on the tables. At this time, some of the 1(1 sna P shot ^ble database 240, snapshot table database 240 

snapshot workers that were requested to obtain snapshot 10 contams a consistent snapshot of the desired tables identified 

times may not yet have been assigned snapshot times. To m data C0 Py table 218 as of a pabular point in time, 

ensure that all processes that participate in the creation of In a preferred embodiment, once the desired tables are 
copies of the relevant tables are seeing the same version of copied into the flat files, the memory based planner 202 

the relevant tables, only those snapshot workers that have reads the flat file information to generate a planning sched- 
been assigned snapshot times prior to the expiration of 15 ule. In an alternative embodiment, the memory based plan- 

T__WAIT (the "successful snapshot workers") are used in ner 202 waits for the fiat files to be copied into the snapshot 
retrieving data from database 204. For the purpose of tables in snapshot table database 240 and then reads the 

explanation, it shall be assumed that snapshot workers snapshot table information to generate a planning schedule. 
206-214 successfully obtained snapshot times, and that 

snapshot worker 215 did not. 20 Sequence for Producing a Copy of a Database 

Because the tables identified in the copy table list 218 a a ^a in a a- n * r ^ js 

. , , . 4 , ^ , j. i i. i FIG. 3 A and 3B are flow diagrams illustrating a method 

were locked at the time the successful snapshot workers ^„ ► „ n _ fJ . , . j 7 u 

• ... . . r « . , tor producing a copy of desired data from database 204 

were assigned their snapshot times, the successful snapshot A - , r V t c , u - « . 

. to A , *, 4 , ' . j * £ .» accordmg to an embodiment of the invention. For explana- 

workers ^guaranteed that .when they retrieve data from the tion the flow & of FIGS 3Aand 3 F fi m 

desired table, they will retrieve data which corresponds to a described ^ reference * components of FIG . 2 . 

single database state. „„ r 

For example, FIG. 2B illustrates locking database 204 to At St f ? 302 > based Pj~ 202 * l xec * ted ^ 

obtain snapshot times that correspond to a single database * f * r t0 e a fjf T § If u * based l U P°° 

state in accordance with an embodiment of the invention. In w d f a fro ? 204 * ste P the memory based 

FIG. 2B>, DB_A, DB_B and DB_C represent different 30 planner 202 causes a sna P shot worker 206 t0 be launched ' 

states of the relevant tables within database 204. In this At ste P 306 > the snapshot worker 206 determines which 

example, the desired tables in database 204 are locked by tables of database 204 are required by the memory based 

coordinator 216 at time Tl for a T_WAIT period of "4", planner 202 for generating the planning schedule. Using this 

and, therefore cannot be updated by another process until 35 information, the snapshot worker 206 generates a copy table 

T5. The coordinator 216 maintains its locks on the desired nst 21 **. 

tables in database 204 until T5. Thus, if snapshot workers In one embodiment, each entry in the copy table list 218 

206, 208 and 210 obtain snapshot times at Tl, T3 and T4, includes a table ID 250, a delete start time 252 and a delete 

they will all retrieve data which corresponds to a single complete time 254. The table ID 250 identifies a particular 

database state (DB__B) from the desired tables. 4Q table in database 204 that is required by the memory based 

In one embodiment, the successful snapshot workers are planner 202 for the generation of the planing schedule. The 

not required to wait until the end of the T_WAIT period of delete start time 252 identifies a system timestamp in which 

time before they begin to retrieve data from database 204. a deiete worker began to delete a corresponding snapshot 

For example, referring to FIG. 2B, if a snapshot worker table (to remove "stale data") in snapshot table database 240, 

obtains a snapshot time at T2, the snapshot worker may 45 as identified by the particular table ID. For example, entry 

begin to retrieve data from the desired tables before the 256 of copy table list 218 indicates that a delete worker 

tables are unlocked at T5. began to delete snapshot table 15 in snapshot table database 

Using the snapshot times, the successful snapshot workers 240 at ±e s y stem *™ of 9oa 

(206, 208, 210, 212 and 214) retrieve data from database 204 The delete complete time 254 identifies a system times- 

and copy it into a plurality of flat files (220, 22, 224, 226 and 50 temp ui which a delete worker completed the deletion of a 

228). The coordinator 216 is responsible for coordinating the snapshot table in snapshot table database 240, as identified 

particular data (i.e. the tables identified in data copy table by the particular table ID. For example, entry 256 of copy 

218) that is retrieved by each of the successful snapshot table list 218 indicates that a delete worker completed the 

workers (206, 208, 210, 212 and 214). deletion of snapshot table 15 in snapshot table database 240 

When any flat file of the plurality of flat files (220, 22, 55 at tne system time of 950. The delete start time 252 and 

224, 226 and 228) has been completed, one of a plurality of delete complete time 254 entries are used by the coordinator 

loader workers (230, 232, 234, 236 and 238) load the flat 2i6 to determine whether data can be loaded into a particular 

files into one of a plurality of snapshot tables (1..N) in snapshot table (1..N). 

snapshot table database 240. The plurality of snapshot tables At step 308, the snapshot worker 206 spawns a coordi- 

(1..N) in snapshot table database 240 are mapped on a 60 nator process 216 for coordinating the copying of data from 

one-to-one basis with the desired database tables (1..N) in database 204. At step 309, the snapshot worker 206 notifies 

database 204. In certain embodiments, the loader workers the coordinator 216 that the copy table list 218 has been 

are SQL loaders that write data from the flat files directly created. At step 310, the coordinator 216 obtains locks on the 

into the snapshot tables (l.JN) in the snapshot table database tables identified in the copy table list 218. 

2^0- 65 At step 312, after obtaining a lock on each table identified 

Before the data from a flat file thus created can be loaded in copy table list 218, the coordinator 216 requests a 

into a particular snapshot table in snapshot table database plurality of snapshot workers (206-214) to solicit snapshot 
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times from the DBMS. In one embodiment, upon executing fiat files, informing the coordinator 216 each time one of the 

a memory based planner, the user may specify a particular flat files has been completed. 

number of snapshot workers for the coordinator 216 to At step 324, the coordinator 216 determines whether the 

request to solicit snapshot times. delete workers (242-248) have deleted the snapshot tables 

At step 314, the coordinator waits a certain period of time 5 (i.e. "stale data") in the snapshot table database 240 that 

(T_WA1T) for the requested snapshot workers (206-214) to correspond to the data that the snapshot worker has just 

be assigned snapshot times. For example, in one completed copying. For example, if the snapshot worker 

embodiment, the coordinator 216 waits several minutes after informs the coordinator 216 that database tables "1", "5", 

requesting the plurality of snapshot workers (206-214) to and "8" in database 204 have just been copied to the flat file 
solicit snapshot times from the DBMS. 10 222, the coordinator 216 determines whether the delete 

At step 316, after the (T_WAIT) period of time expires, workers (242-248) have removed the "stale data" by delet- 

the coordinator releases the locks on the tables identified in i°g snapshot tables "1", "5" and "8" in snapshot table 

copy table list 218. database 240. 

At step 318, the coordinator 216 uses the copy table list In certain embodiments, the coordinator 216 uses the 
218 to assign the successful snapshot workers (in this 15 information in the copy table list 218 to determine if a 

example, snapshot workers 206-214), a particular set of data particular snapshot table has been deleted. In one 

to copy from database 204. For example, snapshot worker embodiment, the coordinator 216 uses the delete complete 

206 may be assigned to copy database table "17" from time 254 parameter to determine if a snapshot table has been 

database 204. Once assigned a set of data, the successful deleted. 

snapshot workers (206-214) begin to copy the assigned data 20 If at step 324 the coordinator 216 determines that a 

from database 204 into flat files (220-228). snapshot table has not been deleted (i.e. snapshot tables "1", 

In certain embodiments, the coordinator 216 assigns the "5" and "8" for the previous example), then at step 326, the 

snapshot workers a particular set of data to copy before the coordinator 216 delays the launching (e.g. spawning) of a 

(T_WAIT) period of time expires. For example, the coor- 2$ loader worker (230-238) for loading the information until 

dinator could assign a particular copying task at the same the snapshot tables "1", "5" and "8" have been deleted. For 

time that the coordinator requests the snapshot workers to example, assuming that snapshot worker 206 has notified 

obtain snapshot times. Without waiting to the T_WAIT coordinator 216 that it has copied database tables "1", "5" 

period to expire, each snapshot worker can begin the copy- and "8" from database 204 into flat file 220, if coordinator 

ing task as soon as (1) it is assigned the set of data copy, and 3Q 216 determines that snapshot table "5*' in snapshot table 

(2) it has successfully obtained a snapshot time. database 240 has not been deleted (i.e. the "stale data" has 

At step 320, the coordinator launches delete workers not been removed), then coordinator 216 will delay launch- 
(242-248) to delete "stale data" contained in the snapshot loader worker 230 for loading the information into 
tables in snapshot table database 240. In deleting the "stale snapshot tables "1", "5" and "8", until snapshot table "5" has 
data", the delete workers use the copy table list 218 to 35 been deleted. Control then proceeds to step 338. 
identify which snapshot tables in snapshot table database If at step 324, the coordinator 216 determines that the 
240 need to be deleted. In one embodiment, the delete snapshot table has been deleted, then at step 328 the coor- 
workers (242-248) respectively update the delete start time dinator 216 launches a loader worker (230-238) to load the 
252 and delete complete time 254 parameters for the cor- information from the flat file into the corresponding snap- 
responding entry as they begin and complete the deletion of ^ shot table in snapshot table database 240. 
a particular table identified in copy table list 218. At step 330, when a loader worker (230-238) finishes 

For example, the delete worker 242 may use the infor- loading a flat file (220-228) into its corresponding snapshot 

mation at entry 258 of copy table list 218 to identify a file m snapshot table database 240, it notifies the coordinator 

snapshot table "7" in snapshot table database 240 as a table 216. In certain embodiments, the loader workers (230-238) 

that needs to be deleted. The delete worker 242 then stores 45 notify the coordinator 216 of the particular snapshot table in 

the current system time ("1000" in this example) in the which the data was loaded. 

delete start time 252 parameter of entry 258 and begins At step 332, the coordinator 216 detennines whether all 

deleting the snapshot table "7". When delete worker 242 the desired tables identified in copy table list 218 have been 

completes the deletion of snapshot table "7", it stores the copied from database 204 into the flat files (220-228). 

current system time ("1130" in this example) in the delete 50 If at step 332 the coordinator 216 determines that all the 

complete time 254 parameter of entry 258. tables identified in copy table list 218 have been copied, then 

It should be noted that although the delete workers at step 334, the coordinator 216 notifies the memory based 

(242-248) are launched at step 320 in this example, the planner 202 that the flat files (220-228) contain a valid copy 

coordinator 216 may actually launch them prior to or even of the desired tables from database 204. At step 336, the 

after step 320. For example, in one embodiment, the coor- 55 memory based planner 202 uses the information in the flat 

dinator 216 launches the delete workers (242-248) imme- files (220-228) to generate the planning schedule, 

diately after it is notified that the copy table list 218 has been In an alternative embodiment, at step 332, the coordinator 

created in step 308. 216 determines whether all the desired tables identified in 

At step 322, a snapshot worker notifies the coordinator copy table list 218 have been copied from database 204 into 

216 it has finished copying the data it was assigned. In one 60 the snapshot tables in snapshot table database 240. If the 

embodiment, the snapshot worker informs the coordinator coordinator 216 determines that all the tables identified in 

216 of the data that was copied (i.e. the particular database copy table list 218 have been copied, then at step 334, the 

tables in database 204) and the location of the one or more coordinator 216 notifies the memory based planner 202 that 

flat files that contains the copy of the data. Although FIG. 2 the snapshot table database 240 contains a valid copy of the 

depicts each snapshot worker using a single flat file for 65 desired tables from database 204. At step 336, the memory 

copying its assigned data, in certain embodiments, the based planner 202 then uses the snapshot tables in snapshot 

snapshot workers may copy their assigned data to multiple table database 240 to generate the planning schedule. 
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If at step 332 the coordinator 216 determines that all the 
tables identified in copy table list 218 have not been copied, 
then at step 338, the coordinator 216 continues to accept and 
process other completion notifications from snapshot work- 
ers (206-214). When the coordinator 216 receives a notifi- 
cation that a snapshot worker (206-214) has finished copy- 
ing its assigned data to a particular flat file, control proceeds 
to step 324 to determine whether the delete workers 
(242-248) have deleted the corresponding snapshot tables in 
the snapshot table database 240. 

Producing a Copy without Locking Database 
Tables 

In the embodiment described above, the desired tables are 
locked for a relatively brief period T_WAIT while snapshot 
workers obtain snapshot times. In an alternative 
embodiment, a mechanism is provided for producing a copy 
of desired tables from a database without locking the desired 
tables. For example, in one embodiment, the following steps 
are performed to obtain a snapshot of a database without 
locking the desired tables: 

(1) the desired tables are identified; 

(2) a single snapshot worker process is requested to obtain 
a snapshot time; 

(3) using the single snapshot worker, a copy of desired 
tables is produced which reflects the state of the database 
associated with the snapshot time. 

FIG. 4 depicts a block diagram of a system 400 that is 
used for producing a copy of a database in accordance with 
an embodiment of the present invention. FIG. 4 is similar to 
FIG. 2, and therefore like components have been numbered 
alike. 

The system 400 differs from that shown in FIG. 2 in that 
system 400 includes a single snapshot worker 206 which is 
used to retrieve data from database 204. By using a single 
snapshot worker 206, only a single snapshot time is 
required. Because only a single snapshot time is required, 
the coordinator 216 is not required to lock the database 
tables in database 204. 

FIG. 5 A and 5B are flow diagrams illustrating a method 
for producing a copy of desired data from database 204, 
without locking the database tables that contain the desired 
data, according to an embodiment of the invention. For 
explanation purposes, the flow diagrams of FIGS. 5 A and 5B 
are described with reference to the components of FIG. 4. 

At step 502, a memory based planner 202 is executed by 
a user to generate a planning schedule that is based upon 
data from database 204. At step 504, the memory based 
planner 202 causes a snapshot worker 206 to be launched. 

At step 506, the snapshot worker 206 determines which 
tables of database 204 are required by the memory based 
planner 202 for generating the planning schedule. Using this 
information, the snapshot worker 206 generates a copy table 
list 218. 

In one embodiment, each entry in the copy table list 218 
includes a table ID 250, a delete start time 252 and a delete 
complete time 254. The table ID 250 identifies a particular 
table in database 204 that is required by the memory based 
planner 202 for the generation of the planing schedule. The 
delete start time 252 identifies a system time stamp in which 
a delete worker began to delete a corresponding snapshot 
table (to remove "stale data") in snapshot table database 240, 
as identified by the particular table ID. For example, entry 
256 of copy table list 218 indicates that a delete worker 
began to delete snapshot table 15 in snapshot table database 
240 at the system time of 900. 
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The delete complete time 254 identifies a system time 
stamp in which a delete worker completed the deletion of a 
snapshot table in snapshot table database 240, as identified 
by the particular table ID. For example, entry 256 of copy 

5 table list 218 indicates that a delete worker completed the 
deletion of snapshot table 15 in snapshot table database 240 
at the system time of 950. The delete start time 252 and 
delete complete time 254 entries are used by the coordinator 
216 to determine whether data can be loaded into a particular 

10 snapshot table (1..N). 

At step 508, the snapshot worker 206 spawns a coordi- 
nator process 216 for coordinating the copying of data from 
database 204. At step 509, the snapshot worker 206 notifies 
the coordinator 216 that the copy table list 218 has been 

15 created. 

At step 510, the coordinator 216 requests the snapshot 
workers 206 to solicit a snapshot times from the DBMS. 

At step 512, the coordinator waits for the snapshot worker 
206 to be assigned a snapshot time. 

20 At step 514, the coordinator 216 uses the copy table list 
218 to assign the snapshot worker 206, a particular set of 
data to copy from database 204. For example, snapshot 
worker 206 may be assigned to copy database table "17" 

25 from database 204. Once assigned a set of data, the snapshot 
worker 206 begins to copy the assigned data from database 
204 into flat files (220-226). 

At step 516, the coordinator launches delete workers 
(242-248) to delete "stale data" contained in the snapshot 

30 tables in snapshot table database 240. In deleting the "stale 
data", the delete workers use the copy table list 218 to 
identify which snapshot tables in snapshot table database 
240 need to be deleted. In one embodiment, the delete 
workers (242-248) respectively update the delete start time 

q5 252 and delete complete time 254 parameters for the cor- 
responding entry as they begin and complete the deletion of 
a particular table identified in copy table list 218. 

For example, the delete worker 242 may use the infor- 
mation at entry 258 of copy table list 218 to identify a 

40 snapshot table "7" in snapshot table database 240 as a table 
that needs to be deleted. The delete worker 242 then stores 
the current system time ("1000" in this example) in the 
delete start time 252 parameter of entry 258 and begins 
deleting the snapshot table "7". When delete worker 242 

45 completes the deletion of snapshot table "7", it stores the 
current system time ("1130" in this example) in the delete 
complete time 254 parameter of entry 258. 

It should be noted that although the delete workers 
(242-248) are launched at step 516 in this example, the 

50 coordinator 216 may actually launch them prior to or even 
after step 516. For example, in one embodiment, the coor- 
dinator 216 launches the delete workers (242-248) imme- 
diately after it is notified that the copy table list 218 has been 
created in step 508. 

55 At step 518, the snapshot worker 206 notifies the coor- 
dinator 216 it has finished copying the data it was assigned. 
In one embodiment, the snapshot worker informs the coor- 
dinator 216 of the data that was copied (i.e. the particular 
database tables in database 204) and the location of the one 

60 or more flat files that contains the copied data. As illustrated 
by flat files (220-226), snapshot worker 206 may copy its 
assigned data to multiple flat files, informing the coordinator 
216 each time one of the flat files has been completed. 
At step 520, the coordinator 216 determines whether the 

65 delete workers (242-248) have deleted the snapshot tables 
(i.e. removed the "stale data") in the snapshot table database 
240 that correspond to the data that the snapshot worker has 



